<World-Wide Web> (URL, previously "Universal") A standard way of
specifying the location of an object, typically a {web page}, on
the Internet. Other types of object are described below. URLs are
the form of address used on the {World-Wide Web}. They are used in
HTML documents to specify the target of a hyperlink which is often
another HTML document (possibly stored on another computer).

Here are some example URLs:
http://w3.org/default.html
http://acme.co.uk:8080/images/map.gif
http://foldoc.org/?Uniform+Resource+Locator
http://w3.org/default.html#Introduction
ftp://wuarchive.wustl.edu/mirrors/msdos/graphics/gifkit.zip
ftp://spy:secret@ftp.acme.com/pub/topsecret/weapon.tgz
mailto:fred@doc.ic.ac.uk
news:alt.hypertext
telnet://dra.com

The part before the first colon specifies the access scheme or
protocol. Commonly implemented schemes include ftp, http
(World-Wide Web), gopher and WAIS. The "file" scheme should only
be used to refer to a file on the same host. Other less commonly
used schemes include news, telnet and mailto (e-mail).

The part after the colon is interpreted according to the access
scheme. In general, two slashes after the colon introduce a
hostname (host:port is also valid, or for FTP user:passwd@host or
user@host). The port number is usually omitted and defaults to the
standard port for the scheme, e.g. port 80 for HTTP.
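
The breakdown into scheme, user, password, host and port can be
sketched with Python's standard urllib.parse module. This is only an
illustration: the describe() helper and the DEFAULT_PORTS table are
hypothetical names introduced here, and the table lists just a few
well-known default ports rather than every scheme.

```python
from urllib.parse import urlsplit

# Illustrative subset of well-known default ports (an assumption of this
# sketch, not a complete registry).
DEFAULT_PORTS = {"http": 80, "ftp": 21, "telnet": 23}


def describe(url):
    """Split a URL into the parts discussed above (hypothetical helper)."""
    parts = urlsplit(url)
    port = parts.port
    if port is None:
        # Port omitted: fall back to the scheme's standard port, if known.
        port = DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
    }


for example in (
    "http://acme.co.uk:8080/images/map.gif",
    "ftp://spy:secret@ftp.acme.com/pub/topsecret/weapon.tgz",
    "mailto:fred@doc.ic.ac.uk",
):
    print(describe(example))
```

Schemes such as mailto have no double slash, so no host is reported
for them and everything after the colon ends up in the path.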

For an HTTP or FTP URL the next part is a pathname which is
usually related to the pathname of a file on the server. The file
can contain any type of data but only certain types are
interpreted directly by most browsers. These include HTML and
images in gif or jpeg format. The file's type is given by a MIME
type in the HTTP headers returned by the server, e.g. "text/html",
"image/gif", and is usually also indicated by its filename
extension. A file whose type is not recognised directly by the
browser may be passed to an external "viewer" application, e.g. a
sound player.
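
The link between filename extension and MIME type can be
illustrated with Python's standard mimetypes module. This is a rough
sketch: it only guesses from the extension, whereas the type that
actually counts is the one the server sends in its HTTP headers. The
guess_from_extension() name is a hypothetical helper.

```python
import mimetypes


def guess_from_extension(url):
    """Guess a MIME type from the URL's filename extension only.

    Returns None when the extension is unknown. A real client would
    prefer the Content-Type header sent by the server.
    """
    mime_type, _encoding = mimetypes.guess_type(url)
    return mime_type


print(guess_from_extension("http://w3.org/default.html"))
print(guess_from_extension("http://acme.co.uk:8080/images/map.gif"))
```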

The last (optional) part of the URL may be a query string preceded
by "?" or a "fragment identifier" preceded by "#". The latter
indicates a particular position within the specified document.
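
As a small sketch of how the optional tail is separated from the
rest of the URL, the following uses urllib.parse on two of the
example URLs given earlier. The split_tail() helper is a hypothetical
name chosen for this illustration.

```python
from urllib.parse import urlsplit, unquote_plus


def split_tail(url):
    """Return the (query, fragment) pair of a URL; either may be empty."""
    parts = urlsplit(url)
    return parts.query, parts.fragment


query, fragment = split_tail("http://foldoc.org/?Uniform+Resource+Locator")
print(query)                # the raw query string, "+" still in place
print(unquote_plus(query))  # "+" decoded back to spaces

query, fragment = split_tail("http://w3.org/default.html#Introduction")
print(fragment)             # the position within the document
```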

Only alphanumerics, reserved characters (:/?#"<>%+) used for their
reserved purposes and "$", "-", "_", ".", "&", "+" are safe and
may be transmitted unencoded. Other characters are encoded as a
"%" followed by two hexadecimal digits. Space may also be encoded
as "+". Standard SGML "&<name>;" character entity encodings
(e.g. "&eacute;") are also accepted when URLs are embedded in
HTML. The terminating semicolon may be omitted if &<name> is
followed by a non-letter character.
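
The encoding rules can be tried out with Python's standard library.
This is a sketch only: urllib.parse follows later URL specifications,
so its set of "safe" characters may not match the list above exactly.
The file name and the example.org address are made up for the
illustration.

```python
import html
from urllib.parse import quote, quote_plus, unquote

# Unsafe characters become "%" plus two hexadecimal digits.
encoded_path = quote("/pub/top secret/weapon.tgz")
print(encoded_path)            # /pub/top%20secret/weapon.tgz

# In query strings a space may instead be written as "+".
encoded_query = quote_plus("Uniform Resource Locator")
print(encoded_query)           # Uniform+Resource+Locator

# Decoding reverses the "%XX" escapes.
print(unquote(encoded_path))   # /pub/top secret/weapon.tgz

# Inside HTML, "&" in a URL is written as a character entity and is
# turned back into "&" before the URL is used.
href_in_html = "http://example.org/?a=1&amp;b=2"
print(html.unescape(href_in_html))  # http://example.org/?a=1&b=2
```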

{The authoritative W3C URL specification
(http://w3.org/hypertext/WWW/Addressing/Addressing.html)}.

(2000-02-17)